A simplified convergence theory for Byzantine resilient stochastic gradient descent

نویسندگان

چکیده

In distributed learning, a central server trains model according to updates provided by nodes holding local data samples. the presence of one or more malicious servers sending incorrect information (a Byzantine adversary), standard algorithms for training such as stochastic gradient descent (SGD) fail converge. this paper, we present simplified convergence theory generic Resilient SGD method originally proposed Blanchard et al. (2017) [3]. Compared existing analysis, shown stationary point in expectation under assumptions on (possibly nonconvex) objective function and flexible gradients.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Byzantine Stochastic Gradient Descent

This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the m machines which allegedly compute stochastic gradients every iteration, an α-fraction are Byzantine, and can behave arbitrarily and adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds ε-approximate minimizers of convex functions in T = Õ ( 1...

متن کامل

Convergence of Stochastic Gradient Descent for PCA

We consider the problem of principal component analysis (PCA) in a streaming stochastic setting, where our goal is to find a direction of approximate maximal variance, based on a stream of i.i.d. data points in R. A simple and computationally cheap algorithm for this is stochastic gradient descent (SGD), which incrementally updates its estimate based on each new data point. However, due to the ...

متن کامل

Convergence Analysis of Gradient Descent Stochastic Algorithms

This paper proves convergence of a sample-path based stochastic gradient-descent algorithm for optimizing expected-value performance measures in discrete event systems. The algorithm uses increasing precision at successive iterations, and it moves against the direction of a generalized gradient of the computed sample performance function. Two convergence results are established: one, for the ca...

متن کامل

Quantized Stochastic Gradient Descent: Communication versus Convergence

Parallel implementations of stochastic gradient descent (SGD) have received signif1 icant research attention, thanks to excellent scalability properties of this algorithm, 2 and to its efficiency in the context of training deep neural networks. A fundamental 3 barrier for parallelizing large-scale SGD is the fact that the cost of communicat4 ing the gradient updates between nodes can be very la...

متن کامل

Convergence diagnostics for stochastic gradient descent with constant step size

Iterative procedures in stochastic optimization are typically comprised of a transient phase and a stationary phase. During the transient phase the procedure converges towards a region of interest, and during the stationary phase the procedure oscillates in a convergence region, commonly around a single point. In this paper, we develop a statistical diagnostic test to detect such phase transiti...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: EURO journal on computational optimization

سال: 2022

ISSN: ['2192-4406', '2192-4414']

DOI: https://doi.org/10.1016/j.ejco.2022.100038